Dataset statistics
| Number of variables | 14 |
|---|---|
| Number of observations | 5695 |
| Missing cells | 0 |
| Missing cells (%) | 0.0% |
| Duplicate rows | 0 |
| Duplicate rows (%) | 0.0% |
| Total size in memory | 623.0 KiB |
| Average record size in memory | 112.0 B |
Variable types
| Numeric | 14 |
|---|---|

| revenue is highly correlated with quantity_orders and 3 other fields | High correlation |
|---|---|
| recency is highly correlated with quantity_orders and 2 other fields | High correlation |
| quantity_orders is highly correlated with revenue and 7 other fields | High correlation |
| quantity_items_purchased is highly correlated with revenue and 4 other fields | High correlation |
| avg_ticket is highly correlated with revenue and 3 other fields | High correlation |
| avg_recency is highly correlated with recency and 2 other fields | High correlation |
| frequency is highly correlated with recency and 2 other fields | High correlation |
| frequency_btwn_purchases is highly correlated with quantity_orders and 1 other field | High correlation |
| avg_basket_size is highly correlated with revenue and 3 other fields | High correlation |
| avg_unique_basked_size is highly correlated with avg_ticket and 1 other field | High correlation |
| quantity_items_returned is highly correlated with quantity_orders and 1 other field | High correlation |
| monetary_returned is highly correlated with quantity_orders and 1 other field | High correlation |
| revenue is highly correlated with quantity_orders and 1 other field | High correlation |
| recency is highly correlated with avg_recency | High correlation |
| quantity_orders is highly correlated with revenue and 1 other field | High correlation |
| quantity_items_purchased is highly correlated with revenue and 1 other field | High correlation |
| avg_ticket is highly correlated with avg_basket_size and 2 other fields | High correlation |
| avg_recency is highly correlated with recency | High correlation |
| avg_basket_size is highly correlated with avg_ticket and 2 other fields | High correlation |
| quantity_items_returned is highly correlated with avg_ticket and 2 other fields | High correlation |
| monetary_returned is highly correlated with avg_ticket and 2 other fields | High correlation |
| revenue is highly correlated with quantity_orders and 3 other fields | High correlation |
| recency is highly correlated with avg_recency and 1 other field | High correlation |
| quantity_orders is highly correlated with revenue and 2 other fields | High correlation |
| quantity_items_purchased is highly correlated with revenue and 2 other fields | High correlation |
| avg_ticket is highly correlated with revenue and 1 other field | High correlation |
| avg_recency is highly correlated with recency and 1 other field | High correlation |
| frequency is highly correlated with recency and 1 other field | High correlation |
| frequency_btwn_purchases is highly correlated with quantity_orders | High correlation |
| avg_basket_size is highly correlated with revenue and 2 other fields | High correlation |
| quantity_items_returned is highly correlated with monetary_returned | High correlation |
| monetary_returned is highly correlated with quantity_items_returned | High correlation |
| customer_id is highly correlated with recency and 2 other fields | High correlation |
| revenue is highly correlated with quantity_orders and 5 other fields | High correlation |
| recency is highly correlated with customer_id and 2 other fields | High correlation |
| quantity_orders is highly correlated with revenue and 2 other fields | High correlation |
| quantity_items_purchased is highly correlated with revenue and 4 other fields | High correlation |
| avg_ticket is highly correlated with avg_basket_size and 2 other fields | High correlation |
| avg_recency is highly correlated with customer_id and 2 other fields | High correlation |
| time_in_base is highly correlated with customer_id and 2 other fields | High correlation |
| frequency is highly correlated with revenue and 1 other field | High correlation |
| avg_basket_size is highly correlated with revenue and 4 other fields | High correlation |
| quantity_items_returned is highly correlated with revenue and 4 other fields | High correlation |
| monetary_returned is highly correlated with revenue and 4 other fields | High correlation |
| revenue is highly skewed (γ1 = 21.62884637) | Skewed |
| quantity_items_purchased is highly skewed (γ1 = 23.05598553) | Skewed |
| avg_ticket is highly skewed (γ1 = 27.82015631) | Skewed |
| avg_basket_size is highly skewed (γ1 = 48.53682353) | Skewed |
| quantity_items_returned is highly skewed (γ1 = 51.5242843) | Skewed |
| monetary_returned is highly skewed (γ1 = 59.48544078) | Skewed |
| customer_id has unique values | Unique |
| quantity_items_returned has 4190 (73.6%) zeros | Zeros |
| monetary_returned has 4190 (73.6%) zeros | Zeros |
Reproduction
| Analysis started | 2022-04-20 14:13:48.520445 |
|---|---|
| Analysis finished | 2022-04-20 14:14:45.800455 |
| Duration | 57.28 seconds |
| Software version | pandas-profiling v3.1.0 |
| Download configuration | config.json |
customer_id
| Distinct | 5695 |
|---|---|
| Distinct (%) | 100.0% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 16600.70834 |
| Minimum | 12346 |
|---|---|
| Maximum | 22709 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 44.6 KiB |
Quantile statistics
| Minimum | 12346 |
|---|---|
| 5-th percentile | 12699.1 |
| Q1 | 14288.5 |
| median | 16229 |
| Q3 | 18210.5 |
| 95-th percentile | 21731.1 |
| Maximum | 22709 |
| Range | 10363 |
| Interquartile range (IQR) | 3922 |
Descriptive statistics
| Standard deviation | 2808.223729 |
|---|---|
| Coefficient of variation (CV) | 0.1691628858 |
| Kurtosis | -0.8211293405 |
| Mean | 16600.70834 |
| Median Absolute Deviation (MAD) | 1962 |
| Skewness | 0.441165902 |
| Sum | 94541034 |
| Variance | 7886120.514 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 17850 | 1 | < 0.1% |
| 12572 | 1 | < 0.1% |
| 17534 | 1 | < 0.1% |
| 17205 | 1 | < 0.1% |
| 16412 | 1 | < 0.1% |
| 13923 | 1 | < 0.1% |
| 17520 | 1 | < 0.1% |
| 17201 | 1 | < 0.1% |
| 16563 | 1 | < 0.1% |
| 18042 | 1 | < 0.1% |
| Other values (5685) | 5685 |
| Value | Count | Frequency (%) |
| 12346 | 1 | |
| 12347 | 1 | |
| 12348 | 1 | |
| 12349 | 1 | |
| 12350 | 1 | |
| 12352 | 1 | |
| 12353 | 1 | |
| 12354 | 1 | |
| 12355 | 1 | |
| 12356 | 1 |
| Value | Count | Frequency (%) |
| 22709 | 1 | |
| 22708 | 1 | |
| 22707 | 1 | |
| 22706 | 1 | |
| 22705 | 1 | |
| 22704 | 1 | |
| 22700 | 1 | |
| 22699 | 1 | |
| 22696 | 1 | |
| 22695 | 1 |
revenue
| Distinct | 5449 |
|---|---|
| Distinct (%) | 95.7% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 1803.857041 |
| Minimum | 0.42 |
|---|---|
| Maximum | 279138.02 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 44.6 KiB |
Quantile statistics
| Minimum | 0.42 |
|---|---|
| 5-th percentile | 13.171 |
| Q1 | 236.24 |
| median | 614.66 |
| Q3 | 1571.11 |
| 95-th percentile | 5323.416 |
| Maximum | 279138.02 |
| Range | 279137.6 |
| Interquartile range (IQR) | 1334.87 |
Descriptive statistics
| Standard deviation | 7897.383597 |
|---|---|
| Coefficient of variation (CV) | 4.378054035 |
| Kurtosis | 608.1754389 |
| Mean | 1803.857041 |
| Median Absolute Deviation (MAD) | 480.18 |
| Skewness | 21.62884637 |
| Sum | 10272965.85 |
| Variance | 62368667.68 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 7.95 | 9 | 0.2% |
| 1.25 | 8 | 0.1% |
| 4.95 | 8 | 0.1% |
| 2.95 | 8 | 0.1% |
| 1.65 | 7 | 0.1% |
| 3.75 | 7 | 0.1% |
| 12.75 | 7 | 0.1% |
| 7.5 | 6 | 0.1% |
| 4.25 | 6 | 0.1% |
| 5.95 | 6 | 0.1% |
| Other values (5439) | 5623 |
| Value | Count | Frequency (%) |
| 0.42 | 1 | < 0.1% |
| 0.65 | 1 | < 0.1% |
| 0.79 | 1 | < 0.1% |
| 0.84 | 4 | |
| 0.85 | 3 | 0.1% |
| 1.07 | 1 | < 0.1% |
| 1.25 | 8 | |
| 1.44 | 1 | < 0.1% |
| 1.65 | 7 | |
| 1.69 | 1 | < 0.1% |
| Value | Count | Frequency (%) |
| 279138.02 | 1 | |
| 259657.3 | 1 | |
| 194550.79 | 1 | |
| 168472.5 | 1 | |
| 140450.72 | 1 | |
| 124564.53 | 1 | |
| 117379.63 | 1 | |
| 91062.38 | 1 | |
| 77183.6 | 1 | |
| 72882.09 | 1 |
recency
| Distinct | 304 |
|---|---|
| Distinct (%) | 5.3% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 116.9069359 |
| Minimum | 0 |
|---|---|
| Maximum | 373 |
| Zeros | 38 |
| Zeros (%) | 0.7% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 44.6 KiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 3 |
| Q1 | 22.5 |
| median | 71 |
| Q3 | 200 |
| 95-th percentile | 338 |
| Maximum | 373 |
| Range | 373 |
| Interquartile range (IQR) | 177.5 |
Descriptive statistics
| Standard deviation | 111.6299008 |
|---|---|
| Coefficient of variation (CV) | 0.9548612315 |
| Kurtosis | -0.643576286 |
| Mean | 116.9069359 |
| Median Absolute Deviation (MAD) | 61 |
| Skewness | 0.8140075817 |
| Sum | 665785 |
| Variance | 12461.23475 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 1 | 110 | 1.9% |
| 4 | 105 | 1.8% |
| 3 | 98 | 1.7% |
| 2 | 91 | 1.6% |
| 10 | 86 | 1.5% |
| 8 | 82 | 1.4% |
| 17 | 79 | 1.4% |
| 9 | 79 | 1.4% |
| 7 | 78 | 1.4% |
| 15 | 67 | 1.2% |
| Other values (294) | 4820 |
| Value | Count | Frequency (%) |
| 0 | 38 | 0.7% |
| 1 | 110 | |
| 2 | 91 | |
| 3 | 98 | |
| 4 | 105 | |
| 5 | 52 | |
| 7 | 78 | |
| 8 | 82 | |
| 9 | 79 | |
| 10 | 86 |
| Value | Count | Frequency (%) |
| 373 | 23 | |
| 372 | 22 | |
| 371 | 17 | |
| 369 | 4 | 0.1% |
| 368 | 13 | |
| 367 | 16 | |
| 366 | 15 | |
| 365 | 19 | |
| 364 | 11 | |
| 362 | 7 | 0.1% |
quantity_orders
Real number (ℝ≥0)
HIGH CORRELATION
| Distinct | 56 |
|---|---|
| Distinct (%) | 1.0% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 3.469710272 |
| Minimum | 1 |
|---|---|
| Maximum | 206 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 44.6 KiB |
Quantile statistics
| Minimum | 1 |
|---|---|
| 5-th percentile | 1 |
| Q1 | 1 |
| median | 1 |
| Q3 | 4 |
| 95-th percentile | 11 |
| Maximum | 206 |
| Range | 205 |
| Interquartile range (IQR) | 3 |
Descriptive statistics
| Standard deviation | 6.809445663 |
|---|---|
| Coefficient of variation (CV) | 1.962540134 |
| Kurtosis | 302.566861 |
| Mean | 3.469710272 |
| Median Absolute Deviation (MAD) | 0 |
| Skewness | 13.20109159 |
| Sum | 19760 |
| Variance | 46.36855023 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 1 | 2871 | |
| 2 | 827 | 14.5% |
| 3 | 501 | 8.8% |
| 4 | 395 | 6.9% |
| 5 | 236 | 4.1% |
| 6 | 173 | 3.0% |
| 7 | 139 | 2.4% |
| 8 | 98 | 1.7% |
| 9 | 68 | 1.2% |
| 10 | 55 | 1.0% |
| Other values (46) | 332 | 5.8% |
| Value | Count | Frequency (%) |
| 1 | 2871 | |
| 2 | 827 | 14.5% |
| 3 | 501 | 8.8% |
| 4 | 395 | 6.9% |
| 5 | 236 | 4.1% |
| 6 | 173 | 3.0% |
| 7 | 139 | 2.4% |
| 8 | 98 | 1.7% |
| 9 | 68 | 1.2% |
| 10 | 55 | 1.0% |
| Value | Count | Frequency (%) |
| 206 | 1 | |
| 199 | 1 | |
| 124 | 1 | |
| 97 | 1 | |
| 91 | 1 | |
| 90 | 1 | |
| 86 | 1 | |
| 72 | 1 | |
| 62 | 2 | |
| 60 | 1 |
quantity_items_purchased
Real number (ℝ≥0)
HIGH CORRELATION, SKEWED
| Distinct | 1842 |
|---|---|
| Distinct (%) | 32.3% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 978.6463565 |
| Minimum | 1 |
|---|---|
| Maximum | 196844 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 44.6 KiB |
Quantile statistics
| Minimum | 1 |
|---|---|
| 5-th percentile | 4 |
| Q1 | 106 |
| median | 317 |
| Q3 | 805.5 |
| 95-th percentile | 2943.3 |
| Maximum | 196844 |
| Range | 196843 |
| Interquartile range (IQR) | 699.5 |
Descriptive statistics
| Standard deviation | 4429.032218 |
|---|---|
| Coefficient of variation (CV) | 4.525671801 |
| Kurtosis | 785.3589653 |
| Mean | 978.6463565 |
| Median Absolute Deviation (MAD) | 253 |
| Skewness | 23.05598553 |
| Sum | 5573391 |
| Variance | 19616326.39 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 1 | 113 | 2.0% |
| 2 | 73 | 1.3% |
| 3 | 51 | 0.9% |
| 4 | 49 | 0.9% |
| 5 | 35 | 0.6% |
| 6 | 29 | 0.5% |
| 12 | 25 | 0.4% |
| 88 | 22 | 0.4% |
| 72 | 21 | 0.4% |
| 7 | 20 | 0.4% |
| Other values (1832) | 5257 |
| Value | Count | Frequency (%) |
| 1 | 113 | |
| 2 | 73 | |
| 3 | 51 | |
| 4 | 49 | |
| 5 | 35 | 0.6% |
| 6 | 29 | 0.5% |
| 7 | 20 | 0.4% |
| 8 | 18 | 0.3% |
| 9 | 7 | 0.1% |
| 10 | 17 | 0.3% |
| Value | Count | Frequency (%) |
| 196844 | 1 | |
| 80997 | 1 | |
| 80263 | 1 | |
| 77373 | 1 | |
| 74215 | 1 | |
| 69993 | 1 | |
| 64549 | 1 | |
| 64124 | 1 | |
| 63312 | 1 | |
| 58343 | 1 |
avg_ticket
Real number (ℝ≥0)
HIGH CORRELATION, SKEWED
| Distinct | 5454 |
|---|---|
| Distinct (%) | 95.8% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 582.1710525 |
| Minimum | 0.42 |
|---|---|
| Maximum | 84236.25 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 44.6 KiB |
Quantile statistics
| Minimum | 0.42 |
|---|---|
| 5-th percentile | 12.835 |
| Q1 | 158.975 |
| median | 297.56 |
| Q3 | 486.8193956 |
| 95-th percentile | 1842.344 |
| Maximum | 84236.25 |
| Range | 84235.83 |
| Interquartile range (IQR) | 327.8443956 |
Descriptive statistics
| Standard deviation | 2040.79593 |
|---|---|
| Coefficient of variation (CV) | 3.505491936 |
| Kurtosis | 987.7715332 |
| Mean | 582.1710525 |
| Median Absolute Deviation (MAD) | 152.36 |
| Skewness | 27.82015631 |
| Sum | 3315464.144 |
| Variance | 4164848.027 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 7.95 | 9 | 0.2% |
| 1.25 | 8 | 0.1% |
| 2.95 | 8 | 0.1% |
| 4.95 | 8 | 0.1% |
| 12.75 | 7 | 0.1% |
| 3.75 | 7 | 0.1% |
| 1.65 | 7 | 0.1% |
| 4.25 | 6 | 0.1% |
| 5.95 | 6 | 0.1% |
| 7.5 | 6 | 0.1% |
| Other values (5444) | 5623 |
| Value | Count | Frequency (%) |
| 0.42 | 1 | < 0.1% |
| 0.65 | 1 | < 0.1% |
| 0.79 | 1 | < 0.1% |
| 0.84 | 4 | |
| 0.85 | 3 | 0.1% |
| 1.07 | 1 | < 0.1% |
| 1.25 | 8 | |
| 1.44 | 1 | < 0.1% |
| 1.65 | 7 | |
| 1.69 | 1 | < 0.1% |
| Value | Count | Frequency (%) |
| 84236.25 | 1 | |
| 77183.6 | 1 | |
| 52940.94 | 1 | |
| 50653.91 | 1 | |
| 21389.6 | 1 | |
| 18745.86 | 1 | |
| 14855.53 | 1 | |
| 14844.76667 | 1 | |
| 13305.5 | 1 | |
| 12681.58 | 1 |
avg_recency
| Distinct | 1181 |
|---|---|
| Distinct (%) | 20.7% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 124.0251413 |
| Minimum | 0 |
|---|---|
| Maximum | 373 |
| Zeros | 4 |
| Zeros (%) | 0.1% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 44.6 KiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 15 |
| Q1 | 44.125 |
| median | 86 |
| Q3 | 184 |
| 95-th percentile | 336.3 |
| Maximum | 373 |
| Range | 373 |
| Interquartile range (IQR) | 139.875 |
Descriptive statistics
| Standard deviation | 101.8129872 |
|---|---|
| Coefficient of variation (CV) | 0.8209060368 |
| Kurtosis | -0.2554173381 |
| Mean | 124.0251413 |
| Median Absolute Deviation (MAD) | 55.33333333 |
| Skewness | 0.9372147346 |
| Sum | 706323.1796 |
| Variance | 10365.88436 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 60 | 32 | 0.6% |
| 53 | 31 | 0.5% |
| 213 | 30 | 0.5% |
| 353 | 30 | 0.5% |
| 184 | 29 | 0.5% |
| 46 | 28 | 0.5% |
| 64 | 27 | 0.5% |
| 28 | 27 | 0.5% |
| 77 | 26 | 0.5% |
| 154 | 25 | 0.4% |
| Other values (1171) | 5410 |
| Value | Count | Frequency (%) |
| 0 | 4 | 0.1% |
| 1 | 11 | |
| 2 | 7 | 0.1% |
| 2.847328244 | 1 | < 0.1% |
| 3 | 13 | |
| 3.300884956 | 1 | < 0.1% |
| 3.330357143 | 1 | < 0.1% |
| 3.333333333 | 1 | < 0.1% |
| 4 | 18 | |
| 4.144444444 | 1 | < 0.1% |
| Value | Count | Frequency (%) |
| 373 | 23 | |
| 372 | 21 | |
| 371 | 17 | |
| 369 | 4 | 0.1% |
| 368 | 13 | |
| 367 | 16 | |
| 366 | 14 | |
| 365 | 19 | |
| 364 | 11 | |
| 362 | 7 | 0.1% |
time_in_base
| Distinct | 305 |
|---|---|
| Distinct (%) | 5.4% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 217.2491659 |
| Minimum | 1 |
|---|---|
| Maximum | 374 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 44.6 KiB |
Quantile statistics
| Minimum | 1 |
|---|---|
| 5-th percentile | 25 |
| Q1 | 110 |
| median | 239 |
| Q3 | 319 |
| 95-th percentile | 370 |
| Maximum | 374 |
| Range | 373 |
| Interquartile range (IQR) | 209 |
Descriptive statistics
| Standard deviation | 116.5840834 |
|---|---|
| Coefficient of variation (CV) | 0.5366376572 |
| Kurtosis | -1.233603755 |
| Mean | 217.2491659 |
| Median Absolute Deviation (MAD) | 96 |
| Skewness | -0.2947870121 |
| Sum | 1237234 |
| Variance | 13591.84851 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 374 | 101 | 1.8% |
| 373 | 97 | 1.7% |
| 367 | 88 | 1.5% |
| 369 | 78 | 1.4% |
| 366 | 76 | 1.3% |
| 370 | 70 | 1.2% |
| 359 | 66 | 1.2% |
| 368 | 61 | 1.1% |
| 372 | 57 | 1.0% |
| 360 | 46 | 0.8% |
| Other values (295) | 4955 |
| Value | Count | Frequency (%) |
| 1 | 4 | 0.1% |
| 2 | 11 | |
| 3 | 7 | 0.1% |
| 4 | 13 | |
| 5 | 18 | |
| 6 | 9 | |
| 8 | 13 | |
| 9 | 6 | 0.1% |
| 10 | 14 | |
| 11 | 21 |
| Value | Count | Frequency (%) |
| 374 | 101 | |
| 373 | 97 | |
| 372 | 57 | |
| 370 | 70 | |
| 369 | 78 | |
| 368 | 61 | |
| 367 | 88 | |
| 366 | 76 | |
| 365 | 45 | |
| 363 | 32 | 0.6% |
frequency
| Distinct | 1222 |
|---|---|
| Distinct (%) | 21.5% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 0.02306421228 |
| Minimum | 0.002673796791 |
|---|---|
| Maximum | 1 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 44.6 KiB |
Quantile statistics
| Minimum | 0.002673796791 |
|---|---|
| 5-th percentile | 0.00296735905 |
| Q1 | 0.005449591281 |
| median | 0.01201201201 |
| Q3 | 0.02379800602 |
| 95-th percentile | 0.06879962081 |
| Maximum | 1 |
| Range | 0.9973262032 |
| Interquartile range (IQR) | 0.01834841474 |
Descriptive statistics
| Standard deviation | 0.04828311569 |
|---|---|
| Coefficient of variation (CV) | 2.093421405 |
| Kurtosis | 173.325764 |
| Mean | 0.02306421228 |
| Median Absolute Deviation (MAD) | 0.00750018311 |
| Skewness | 10.79745302 |
| Sum | 131.3506889 |
| Variance | 0.00233125926 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 0.01851851852 | 37 | 0.6% |
| 0.005405405405 | 32 | 0.6% |
| 0.01639344262 | 31 | 0.5% |
| 0.002824858757 | 30 | 0.5% |
| 0.004672897196 | 30 | 0.5% |
| 0.01538461538 | 29 | 0.5% |
| 0.05263157895 | 29 | 0.5% |
| 0.01923076923 | 28 | 0.5% |
| 0.025 | 27 | 0.5% |
| 0.04545454545 | 26 | 0.5% |
| Other values (1212) | 5396 |
| Value | Count | Frequency (%) |
| 0.002673796791 | 22 | |
| 0.002680965147 | 21 | |
| 0.002688172043 | 17 | |
| 0.002702702703 | 3 | 0.1% |
| 0.0027100271 | 13 | |
| 0.002717391304 | 16 | |
| 0.00272479564 | 14 | |
| 0.002732240437 | 19 | |
| 0.002739726027 | 11 | |
| 0.002754820937 | 7 | 0.1% |
| Value | Count | Frequency (%) |
| 1 | 5 | |
| 0.550802139 | 1 | < 0.1% |
| 0.5320855615 | 1 | < 0.1% |
| 0.5 | 11 | |
| 0.4 | 1 | < 0.1% |
| 0.3333333333 | 6 | |
| 0.3315508021 | 1 | < 0.1% |
| 0.3157894737 | 1 | < 0.1% |
| 0.2727272727 | 2 | < 0.1% |
| 0.2621621622 | 1 | < 0.1% |
frequency_btwn_purchases
| Distinct | 1225 |
|---|---|
| Distinct (%) | 21.5% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 0.5475706259 |
| Minimum | 0.005449591281 |
|---|---|
| Maximum | 17 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 44.6 KiB |
Quantile statistics
| Minimum | 0.005449591281 |
|---|---|
| 5-th percentile | 0.01102941176 |
| Q1 | 0.02492211838 |
| median | 1 |
| Q3 | 1 |
| 95-th percentile | 1 |
| Maximum | 17 |
| Range | 16.99455041 |
| Interquartile range (IQR) | 0.9750778816 |
Descriptive statistics
| Standard deviation | 0.5505967909 |
|---|---|
| Coefficient of variation (CV) | 1.005526529 |
| Kurtosis | 138.7856997 |
| Mean | 0.5475706259 |
| Median Absolute Deviation (MAD) | 0 |
| Skewness | 4.851371477 |
| Sum | 3118.414715 |
| Variance | 0.3031568261 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 1 | 2879 | |
| 2 | 48 | 0.8% |
| 0.0625 | 18 | 0.3% |
| 0.02777777778 | 17 | 0.3% |
| 0.02380952381 | 16 | 0.3% |
| 0.09090909091 | 15 | 0.3% |
| 0.08333333333 | 15 | 0.3% |
| 0.03448275862 | 14 | 0.2% |
| 0.02941176471 | 14 | 0.2% |
| 0.01923076923 | 13 | 0.2% |
| Other values (1215) | 2646 |
| Value | Count | Frequency (%) |
| 0.005449591281 | 1 | < 0.1% |
| 0.005464480874 | 1 | < 0.1% |
| 0.005479452055 | 1 | < 0.1% |
| 0.005494505495 | 1 | < 0.1% |
| 0.005586592179 | 2 | |
| 0.005602240896 | 1 | < 0.1% |
| 0.005617977528 | 2 | |
| 0.00566572238 | 1 | < 0.1% |
| 0.005681818182 | 2 | |
| 0.005698005698 | 3 |
| Value | Count | Frequency (%) |
| 17 | 1 | < 0.1% |
| 4 | 1 | < 0.1% |
| 3 | 5 | 0.1% |
| 2 | 48 | 0.8% |
| 1.142857143 | 1 | < 0.1% |
| 1 | 2879 | |
| 0.75 | 1 | < 0.1% |
| 0.6666666667 | 3 | 0.1% |
| 0.550802139 | 1 | < 0.1% |
| 0.5335120643 | 1 | < 0.1% |
avg_basket_size
Real number (ℝ≥0)
HIGH CORRELATION, SKEWED
| Distinct | 2371 |
|---|---|
| Distinct (%) | 41.6% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 268.271079 |
| Minimum | 1 |
|---|---|
| Maximum | 74215 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 44.6 KiB |
Quantile statistics
| Minimum | 1 |
|---|---|
| 5-th percentile | 4 |
| Q1 | 75 |
| median | 152 |
| Q3 | 290.7083333 |
| 95-th percentile | 734.3 |
| Maximum | 74215 |
| Range | 74214 |
| Interquartile range (IQR) | 215.7083333 |
Descriptive statistics
| Standard deviation | 1199.192546 |
|---|---|
| Coefficient of variation (CV) | 4.470077617 |
| Kurtosis | 2768.431965 |
| Mean | 268.271079 |
| Median Absolute Deviation (MAD) | 97 |
| Skewness | 48.53682353 |
| Sum | 1527803.795 |
| Variance | 1438062.761 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 1 | 114 | 2.0% |
| 2 | 72 | 1.3% |
| 3 | 51 | 0.9% |
| 4 | 49 | 0.9% |
| 5 | 35 | 0.6% |
| 6 | 29 | 0.5% |
| 12 | 26 | 0.5% |
| 72 | 22 | 0.4% |
| 100 | 22 | 0.4% |
| 88 | 21 | 0.4% |
| Other values (2361) | 5254 |
| Value | Count | Frequency (%) |
| 1 | 114 | |
| 2 | 72 | |
| 3 | 51 | |
| 3.333333333 | 1 | < 0.1% |
| 4 | 49 | |
| 5 | 35 | 0.6% |
| 5.333333333 | 1 | < 0.1% |
| 5.666666667 | 1 | < 0.1% |
| 6 | 29 | 0.5% |
| 6.142857143 | 1 | < 0.1% |
| Value | Count | Frequency (%) |
| 74215 | 1 | |
| 40498.5 | 1 | |
| 14149 | 1 | |
| 13956 | 1 | |
| 7824 | 1 | |
| 6009.333333 | 1 | |
| 5963 | 1 | |
| 5197 | 1 | |
| 4300 | 1 | |
| 4282 | 1 |
avg_unique_basked_size
| Distinct | 1172 |
|---|---|
| Distinct (%) | 20.6% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 37.26168455 |
| Minimum | 0.2 |
|---|---|
| Maximum | 1109 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 44.6 KiB |
Quantile statistics
| Minimum | 0.2 |
|---|---|
| 5-th percentile | 1 |
| Q1 | 7.25 |
| median | 15 |
| Q3 | 31 |
| 95-th percentile | 173 |
| Maximum | 1109 |
| Range | 1108.8 |
| Interquartile range (IQR) | 23.75 |
Descriptive statistics
| Standard deviation | 76.88203211 |
|---|---|
| Coefficient of variation (CV) | 2.063299957 |
| Kurtosis | 32.88890128 |
| Mean | 37.26168455 |
| Median Absolute Deviation (MAD) | 10 |
| Skewness | 5.073511708 |
| Sum | 212205.2935 |
| Variance | 5910.846861 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 1 | 277 | 4.9% |
| 2 | 161 | 2.8% |
| 3 | 114 | 2.0% |
| 9 | 105 | 1.8% |
| 10 | 105 | 1.8% |
| 8 | 103 | 1.8% |
| 5 | 102 | 1.8% |
| 7 | 101 | 1.8% |
| 6 | 101 | 1.8% |
| 13 | 97 | 1.7% |
| Other values (1162) | 4429 |
| Value | Count | Frequency (%) |
| 0.2 | 1 | < 0.1% |
| 0.25 | 3 | 0.1% |
| 0.3333333333 | 7 | |
| 0.4 | 1 | < 0.1% |
| 0.4090909091 | 1 | < 0.1% |
| 0.5 | 12 | |
| 0.5454545455 | 1 | < 0.1% |
| 0.5555555556 | 1 | < 0.1% |
| 0.5714285714 | 1 | < 0.1% |
| 0.6176470588 | 1 | < 0.1% |
| Value | Count | Frequency (%) |
| 1109 | 1 | |
| 748 | 1 | |
| 730 | 1 | |
| 720 | 1 | |
| 703 | 1 | |
| 686 | 1 | |
| 675 | 1 | |
| 673 | 1 | |
| 660 | 1 | |
| 649 | 1 |
quantity_items_returned
Real number (ℝ≥0)
HIGH CORRELATION, SKEWED, ZEROS
| Distinct | 216 |
|---|---|
| Distinct (%) | 3.8% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 47.08446005 |
| Minimum | 0 |
|---|---|
| Maximum | 80995 |
| Zeros | 4190 |
| Zeros (%) | 73.6% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 44.6 KiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 0 |
| Q1 | 0 |
| median | 0 |
| Q3 | 1 |
| 95-th percentile | 39 |
| Maximum | 80995 |
| Range | 80995 |
| Interquartile range (IQR) | 1 |
Descriptive statistics
| Standard deviation | 1474.760408 |
|---|---|
| Coefficient of variation (CV) | 31.32159541 |
| Kurtosis | 2718.145124 |
| Mean | 47.08446005 |
| Median Absolute Deviation (MAD) | 0 |
| Skewness | 51.5242843 |
| Sum | 268146 |
| Variance | 2174918.261 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 0 | 4190 | |
| 1 | 169 | 3.0% |
| 2 | 150 | 2.6% |
| 3 | 105 | 1.8% |
| 4 | 89 | 1.6% |
| 6 | 78 | 1.4% |
| 5 | 61 | 1.1% |
| 12 | 52 | 0.9% |
| 7 | 44 | 0.8% |
| 8 | 43 | 0.8% |
| Other values (206) | 714 | 12.5% |
| Value | Count | Frequency (%) |
| 0 | 4190 | |
| 1 | 169 | 3.0% |
| 2 | 150 | 2.6% |
| 3 | 105 | 1.8% |
| 4 | 89 | 1.6% |
| 5 | 61 | 1.1% |
| 6 | 78 | 1.4% |
| 7 | 44 | 0.8% |
| 8 | 43 | 0.8% |
| 9 | 41 | 0.7% |
| Value | Count | Frequency (%) |
| 80995 | 1 | |
| 74215 | 1 | |
| 9360 | 1 | |
| 9014 | 1 | |
| 8004 | 1 | |
| 4427 | 1 | |
| 3768 | 1 | |
| 3332 | 1 | |
| 2878 | 1 | |
| 2022 | 1 |
monetary_returned
Real number (ℝ≥0)
HIGH CORRELATION, SKEWED, ZEROS
| Distinct | 1087 |
|---|---|
| Distinct (%) | 19.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 82.62016681 |
| Minimum | 0 |
|---|---|
| Maximum | 168469.6 |
| Zeros | 4190 |
| Zeros (%) | 73.6% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 44.6 KiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 0 |
| Q1 | 0 |
| median | 0 |
| Q3 | 3.925 |
| 95-th percentile | 107.4 |
| Maximum | 168469.6 |
| Range | 168469.6 |
| Interquartile range (IQR) | 3.925 |
Descriptive statistics
| Standard deviation | 2493.554888 |
|---|---|
| Coefficient of variation (CV) | 30.18094714 |
| Kurtosis | 3815.139979 |
| Mean | 82.62016681 |
| Median Absolute Deviation (MAD) | 0 |
| Skewness | 59.48544078 |
| Sum | 470521.85 |
| Variance | 6217815.978 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 0 | 4190 | |
| 12.75 | 20 | 0.4% |
| 4.95 | 19 | 0.3% |
| 9.95 | 17 | 0.3% |
| 15 | 17 | 0.3% |
| 5.9 | 12 | 0.2% |
| 25.5 | 11 | 0.2% |
| 4.25 | 10 | 0.2% |
| 3.75 | 9 | 0.2% |
| 19.9 | 8 | 0.1% |
| Other values (1077) | 1382 | 24.3% |
| Value | Count | Frequency (%) |
| 0 | 4190 | |
| 0.42 | 2 | < 0.1% |
| 0.65 | 1 | < 0.1% |
| 0.95 | 1 | < 0.1% |
| 1.25 | 4 | 0.1% |
| 1.45 | 4 | 0.1% |
| 1.64 | 1 | < 0.1% |
| 1.65 | 5 | 0.1% |
| 1.7 | 2 | < 0.1% |
| 1.79 | 1 | < 0.1% |
| Value | Count | Frequency (%) |
| 168469.6 | 1 | |
| 77183.6 | 1 | |
| 22998.4 | 1 | |
| 14688.24 | 1 | |
| 8511.15 | 1 | |
| 7443.59 | 1 | |
| 5228.4 | 1 | |
| 4815.26 | 1 | |
| 4814.74 | 1 | |
| 4486.24 | 1 |
Spearman's ρ
Spearman's rank correlation coefficient (ρ) is a measure of monotonic correlation between two variables, and is therefore better at catching nonlinear monotonic correlations than Pearson's r. Its value lies between -1 and +1: -1 indicates total negative monotonic correlation, 0 indicates no monotonic correlation, and +1 indicates total positive monotonic correlation. To calculate ρ for two variables X and Y, one divides the covariance of the rank variables of X and Y by the product of their standard deviations.
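The calculation described above can be sketched in plain Python. This is an illustrative sketch, not the code the profiling tool runs: the helper names `average_ranks` and `spearman_rho` are hypothetical, ties are assumed to receive the average of their positions, and the inputs are the `revenue` and `quantity_orders` values from the first five sample rows of this report.

```python
from statistics import mean, pstdev


def average_ranks(values):
    """Return 1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the block while the next sorted value ties with the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        shared_rank = (start + end) / 2 + 1
        for position in range(start, end + 1):
            ranks[order[position]] = shared_rank
        start = end + 1
    return ranks


def spearman_rho(x, y):
    """Covariance of the rank variables divided by the product of their standard deviations."""
    rx, ry = average_ranks(x), average_ranks(y)
    mean_rx, mean_ry = mean(rx), mean(ry)
    covariance = sum((a - mean_rx) * (b - mean_ry) for a, b in zip(rx, ry)) / len(rx)
    return covariance / (pstdev(rx) * pstdev(ry))


# revenue and quantity_orders from the first five sample rows of this report
revenue = [5391.21, 3232.59, 6705.38, 948.25, 876.00]
quantity_orders = [34, 9, 15, 5, 3]
rho = spearman_rho(revenue, quantity_orders)
print(round(rho, 4))
```

Because only ranks enter the computation, extreme values such as the largest `revenue` entries influence ρ no more than any other observation, which is useful for the heavily skewed fields in this dataset.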
Pearson's r
Pearson's correlation coefficient (r) is a measure of linear correlation between two variables. Its value lies between -1 and +1: -1 indicates total negative linear correlation, 0 indicates no linear correlation, and +1 indicates total positive linear correlation. Furthermore, r is invariant under separate changes in location and scale of the two variables, implying that for a linear function the angle to the x-axis does not affect r. To calculate r for two variables X and Y, one divides the covariance of X and Y by the product of their standard deviations.
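A minimal sketch of this definition, together with a check of the location and scale invariance, might look like the following. The function name `pearson_r` and the toy data are assumptions made for illustration; population (divide-by-n) moments are used, which cancel in the ratio.

```python
from math import sqrt


def pearson_r(x, y):
    """Covariance of x and y divided by the product of their standard deviations."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    covariance = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / n
    std_x = sqrt(sum((a - mean_x) ** 2 for a in x) / n)
    std_y = sqrt(sum((b - mean_y) ** 2 for b in y) / n)
    return covariance / (std_x * std_y)


# Toy data (hypothetical, not taken from the profiled dataset)
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1]

r_original = pearson_r(x, y)
# Shift and rescale x only (positive scale factor): r should be unchanged up to rounding.
r_transformed = pearson_r([10.0 * a + 3.0 for a in x], y)
print(r_original, r_transformed)
```

Note that the invariance holds for positive scale factors; multiplying one variable by a negative constant flips the sign of r.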
Kendall's τ
Similarly to Spearman's rank correlation coefficient, the Kendall rank correlation coefficient (τ) measures ordinal association between two variables. Its value lies between -1 and +1: -1 indicates total negative correlation, 0 indicates no correlation, and +1 indicates total positive correlation. To calculate τ for two variables X and Y, one determines the number of concordant and discordant pairs of observations. τ is given by the number of concordant pairs minus the number of discordant pairs, divided by the total number of pairs.
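The pair-counting rule can be sketched directly. The function name `kendall_tau` is hypothetical, and the sketch implements only the simple formula stated above (often called τ-a); library implementations commonly apply a tie correction, so their results can differ when the data contain ties. The inputs are again the `revenue` and `quantity_orders` values from the first five sample rows.

```python
from itertools import combinations


def kendall_tau(x, y):
    """(concordant pairs - discordant pairs) / total number of pairs."""
    concordant = discordant = 0
    pairs = list(combinations(range(len(x)), 2))
    for i, j in pairs:
        direction = (x[i] - x[j]) * (y[i] - y[j])
        if direction > 0:
            concordant += 1  # both variables move in the same direction
        elif direction < 0:
            discordant += 1  # the variables move in opposite directions
        # direction == 0 is a tie and counts toward neither total
    return (concordant - discordant) / len(pairs)


# revenue and quantity_orders from the first five sample rows of this report
revenue = [5391.21, 3232.59, 6705.38, 948.25, 876.00]
quantity_orders = [34, 9, 15, 5, 3]
tau = kendall_tau(revenue, quantity_orders)
print(tau)
```

For these five rows, nine of the ten pairs are concordant and one is discordant. The quadratic loop over all pairs is fine for a sketch but slow for thousands of observations.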
Phik (φk)
Phik (φk) is a new and practical correlation coefficient that works consistently between categorical, ordinal and interval variables, captures non-linear dependency and reverts to the Pearson correlation coefficient in case of a bivariate normal input distribution. There is extensive documentation available here.
First rows
| | customer_id | revenue | recency | quantity_orders | quantity_items_purchased | avg_ticket | avg_recency | time_in_base | frequency | frequency_btwn_purchases | avg_basket_size | avg_unique_basked_size | quantity_items_returned | monetary_returned |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 17850 | 5391.21 | 372 | 34 | 1733 | 158.565000 | 186.500000 | 374 | 0.090909 | 17.000000 | 50.970588 | 0.617647 | 40 | 102.58 |
| 1 | 13047 | 3232.59 | 56 | 9 | 1390 | 359.176667 | 53.285714 | 374 | 0.024064 | 0.028302 | 154.444444 | 11.666667 | 35 | 143.49 |
| 2 | 12583 | 6705.38 | 2 | 15 | 5028 | 447.025333 | 24.866667 | 374 | 0.040107 | 0.040323 | 335.200000 | 7.600000 | 50 | 76.04 |
| 3 | 13748 | 948.25 | 95 | 5 | 439 | 189.650000 | 93.250000 | 374 | 0.013369 | 0.017921 | 87.800000 | 4.800000 | 0 | 0.00 |
| 4 | 15100 | 876.00 | 333 | 3 | 80 | 292.000000 | 124.333333 | 374 | 0.008021 | 0.073171 | 26.666667 | 0.333333 | 22 | 240.90 |
| 5 | 15291 | 4623.30 | 25 | 14 | 2102 | 330.235714 | 26.642857 | 374 | 0.037433 | 0.040115 | 150.142857 | 4.357143 | 29 | 71.79 |
| 6 | 14688 | 5630.87 | 7 | 21 | 3621 | 268.136667 | 18.650000 | 374 | 0.056150 | 0.057221 | 172.428571 | 7.047619 | 399 | 523.49 |
| 7 | 17809 | 5411.91 | 16 | 12 | 2057 | 450.992500 | 37.300000 | 374 | 0.032086 | 0.033520 | 171.416667 | 3.833333 | 41 | 67.06 |
| 8 | 15311 | 60767.90 | 0 | 91 | 38194 | 667.779121 | 4.144444 | 374 | 0.243316 | 0.243316 | 419.714286 | 6.230769 | 474 | 1348.56 |
| 9 | 16098 | 2005.63 | 87 | 7 | 613 | 286.518571 | 53.285714 | 374 | 0.018717 | 0.024390 | 87.571429 | 4.857143 | 0 | 0.00 |
Last rows
| | customer_id | revenue | recency | quantity_orders | quantity_items_purchased | avg_ticket | avg_recency | time_in_base | frequency | frequency_btwn_purchases | avg_basket_size | avg_unique_basked_size | quantity_items_returned | monetary_returned |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 5685 | 22695 | 6083.95 | 1 | 1 | 1852 | 6083.95 | 1.0 | 2 | 0.5 | 1.0 | 1852.0 | 675.0 | 0 | 0.0 |
| 5686 | 22696 | 7150.07 | 1 | 1 | 2150 | 7150.07 | 1.0 | 2 | 0.5 | 1.0 | 2150.0 | 748.0 | 0 | 0.0 |
| 5687 | 22699 | 3686.80 | 1 | 1 | 691 | 3686.80 | 1.0 | 2 | 0.5 | 1.0 | 691.0 | 203.0 | 0 | 0.0 |
| 5688 | 22700 | 4839.42 | 1 | 1 | 1074 | 4839.42 | 1.0 | 2 | 0.5 | 1.0 | 1074.0 | 55.0 | 0 | 0.0 |
| 5689 | 22704 | 17.90 | 1 | 1 | 14 | 17.90 | 1.0 | 2 | 0.5 | 1.0 | 14.0 | 7.0 | 0 | 0.0 |
| 5690 | 22705 | 3.35 | 1 | 1 | 2 | 3.35 | 1.0 | 2 | 0.5 | 1.0 | 2.0 | 2.0 | 0 | 0.0 |
| 5691 | 22706 | 5699.00 | 1 | 1 | 1747 | 5699.00 | 1.0 | 2 | 0.5 | 1.0 | 1747.0 | 634.0 | 0 | 0.0 |
| 5692 | 22707 | 6756.06 | 0 | 1 | 2010 | 6756.06 | 0.0 | 1 | 1.0 | 1.0 | 2010.0 | 730.0 | 0 | 0.0 |
| 5693 | 22708 | 3217.20 | 0 | 1 | 654 | 3217.20 | 0.0 | 1 | 1.0 | 1.0 | 654.0 | 56.0 | 0 | 0.0 |
| 5694 | 22709 | 3950.72 | 0 | 1 | 731 | 3950.72 | 0.0 | 1 | 1.0 | 1.0 | 731.0 | 217.0 | 0 | 0.0 |